Overview

Dataset statistics

Number of variables: 16
Number of observations: 150451
Missing cells: 301974
Missing cells (%): 12.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.4 MiB
Average record size in memory: 128.0 B

Variable types

NUM: 10
CAT: 6

Warnings

Match_Date has a high cardinality: 450 distinct values [High cardinality]
Player_Out is highly correlated with Striker [High correlation]
Striker is highly correlated with Player_Out [High correlation]
Striker_Batting_Position has 13861 (9.2%) missing values [Missing]
Player_Out has 143013 (95.1%) missing values [Missing]
Fielders has 145100 (96.4%) missing values [Missing]
Batsman_Runs_Scored has 61151 (40.6%) zeros [Zeros]
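
As a consistency check, the three Missing warnings account for every missing cell reported under Dataset statistics. The following is a minimal sketch of that arithmetic, using only figures quoted in this report; the variable names are illustrative and not part of the report.

```python
# Figures copied from the report; names are illustrative.
n_rows = 150451
n_cols = 16
missing_per_column = {
    "Striker_Batting_Position": 13861,
    "Player_Out": 143013,
    "Fielders": 145100,
}

# Total missing cells and their share of all cells (rows x columns).
missing_cells = sum(missing_per_column.values())
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

# Per-column missing share, rounded to one decimal as in the warnings.
column_pct = {
    name: round(100 * count / n_rows, 1)
    for name, count in missing_per_column.items()
}

print(missing_cells, round(missing_pct, 1), column_pct)
```

The per-column percentages divide by the number of observations, while the overall percentage divides by the number of cells, which is why 95.1% and 96.4% missing in two columns still give only 12.5% overall.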

Reproduction

Analysis started: 2021-11-09 10:42:53.332133
Analysis finished: 2021-11-09 10:43:17.694374
Duration: 24.36 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

MatcH_id
Real number (ℝ≥0)

Distinct: 636
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 636207.5251
Minimum: 335987
Maximum: 1082650
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 335987
5-th percentile: 336019
Q1: 419154
Median: 548382
Q3: 829742
95-th percentile: 1082617
Maximum: 1082650
Range: 746663
Interquartile range (IQR): 410588

Descriptive statistics

Standard deviation: 234362.2892
Coefficient of variation (CV): 0.368373966
Kurtosis: -0.8701765653
Mean: 636207.5251
Median Absolute Deviation (MAD): 156170
Skewness: 0.5973540118
Sum: 9.571805835e+10
Variance: 5.49256826e+10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
392195 | 267 | 0.2%
1082625 | 263 | 0.2%
729320 | 262 | 0.2%
829742 | 261 | 0.2%
598009 | 261 | 0.2%
829816 | 259 | 0.2%
419126 | 259 | 0.2%
829746 | 258 | 0.2%
598022 | 258 | 0.2%
419147 | 257 | 0.2%
Other values (626) | 147846 | 98.3%

Minimum 5 values

Value | Count | Frequency (%)
335987 | 225 | 0.1%
335988 | 248 | 0.2%
335989 | 219 | 0.1%
335990 | 246 | 0.2%
335991 | 240 | 0.2%

Maximum 5 values

Value | Count | Frequency (%)
1082650 | 248 | 0.2%
1082649 | 207 | 0.1%
1082648 | 157 | 0.1%
1082647 | 252 | 0.2%
1082646 | 249 | 0.2%

Over_id
Real number (ℝ≥0)

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.14270427
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 5
Median: 10
Q3: 15
95-th percentile: 19
Maximum: 20
Range: 19
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.674254966
Coefficient of variation (CV): 0.5594420202
Kurtosis: -1.18115921
Mean: 10.14270427
Median Absolute Deviation (MAD): 5
Skewness: 0.05347944711
Sum: 1525980
Variance: 32.19716942
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=20)

Common values

Value | Count | Frequency (%)
1 | 8092 | 5.4%
2 | 8017 | 5.3%
3 | 7931 | 5.3%
4 | 7901 | 5.3%
5 | 7873 | 5.2%
6 | 7864 | 5.2%
7 | 7826 | 5.2%
8 | 7799 | 5.2%
9 | 7775 | 5.2%
10 | 7726 | 5.1%
Other values (10) | 71647 | 47.6%

Minimum 5 values

Value | Count | Frequency (%)
1 | 8092 | 5.4%
2 | 8017 | 5.3%
3 | 7931 | 5.3%
4 | 7901 | 5.3%
5 | 7873 | 5.2%

Maximum 5 values

Value | Count | Frequency (%)
20 | 5648 | 3.8%
19 | 6542 | 4.3%
18 | 6979 | 4.6%
17 | 7233 | 4.8%
16 | 7332 | 4.9%

Ball_id
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.616639304
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 5
95-th percentile: 6
Maximum: 9
Range: 8
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.807638431
Coefficient of variation (CV): 0.4998116425
Kurtosis: -1.081746879
Mean: 3.616639304
Median Absolute Deviation (MAD): 2
Skewness: 0.09679432471
Sum: 544127
Variance: 3.267556698
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common values

Value | Count | Frequency (%)
1 | 24388 | 16.2%
2 | 24330 | 16.2%
3 | 24261 | 16.1%
4 | 24202 | 16.1%
5 | 24123 | 16.0%
6 | 24041 | 16.0%
7 | 4324 | 2.9%
8 | 679 | 0.5%
9 | 103 | 0.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 24388 | 16.2%
2 | 24330 | 16.2%
3 | 24261 | 16.1%
4 | 24202 | 16.1%
5 | 24123 | 16.0%

Maximum 5 values

Value | Count | Frequency (%)
9 | 103 | 0.1%
8 | 679 | 0.5%
7 | 4324 | 2.9%
6 | 24041 | 16.0%
5 | 24123 | 16.0%

Innings_No
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count | Frequency (%)
1 | 78024 | 51.9%
2 | 72346 | 48.1%
3 | 43 | < 0.1%
4 | 38 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Team_Batting
Categorical

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count | Frequency (%)
7 | 16988 | 11.3%
2 | 16140 | 10.7%
4 | 15991 | 10.6%
3 | 15821 | 10.5%
6 | 15481 | 10.3%
1 | 15416 | 10.2%
5 | 13845 | 9.2%
8 | 9033 | 6.0%
11 | 7379 | 4.9%
10 | 5443 | 3.6%
Other values (11) | 18914 | 12.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 27
Median length: 1
Mean length: 2.711779915
Min length: 1

Team_Bowling
Categorical

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count | Frequency (%)
7 | 16704 | 11.1%
2 | 16416 | 10.9%
4 | 15776 | 10.5%
1 | 15585 | 10.4%
6 | 15534 | 10.3%
3 | 15493 | 10.3%
5 | 14177 | 9.4%
8 | 9038 | 6.0%
11 | 7276 | 4.8%
10 | 5457 | 3.6%
Other values (11) | 18995 | 12.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 27
Median length: 1
Mean length: 2.713069371
Min length: 1

Striker_Batting_Position
Real number (ℝ≥0)

MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 13861
Missing (%): 9.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.583637162
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 8
Maximum: 11
Range: 10
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.145089808
Coefficient of variation (CV): 0.5985789605
Kurtosis: 0.1815855545
Mean: 3.583637162
Median Absolute Deviation (MAD): 1
Skewness: 0.7946178026
Sum: 489489
Variance: 4.601410282
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
2 | 25584 | 17.0%
1 | 25476 | 16.9%
3 | 23439 | 15.6%
4 | 20435 | 13.6%
5 | 16627 | 11.1%
6 | 10844 | 7.2%
7 | 6633 | 4.4%
8 | 3834 | 2.5%
9 | 2126 | 1.4%
10 | 1160 | 0.8%
(Missing) | 13861 | 9.2%

Minimum 5 values

Value | Count | Frequency (%)
1 | 25476 | 16.9%
2 | 25584 | 17.0%
3 | 23439 | 15.6%
4 | 20435 | 13.6%
5 | 16627 | 11.1%

Maximum 5 values

Value | Count | Frequency (%)
11 | 432 | 0.3%
10 | 1160 | 0.8%
9 | 2126 | 1.4%
8 | 3834 | 2.5%
7 | 6633 | 4.4%

Extra_Type
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count | Frequency (%)
No Extras | 142255 | 94.6%
wides | 4153 | 2.8%
legbyes | 2357 | 1.6%
noballs | 579 | 0.4%
Wides | 422 | 0.3%
byes | 379 | 0.3%
Legbyes | 233 | 0.2%
Noballs | 39 | < 0.1%
Byes | 33 | < 0.1%
penalty | 1 | < 0.1%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 9
Median length: 9
Mean length: 8.822015141
Min length: 4

Batsman_Runs_Scored
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.22219859
Minimum: 0
Maximum: 6
Zeros: 61151
Zeros (%): 40.6%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 4
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.594310883
Coefficient of variation (CV): 1.3044614
Kurtosis: 1.697061955
Mean: 1.22219859
Median Absolute Deviation (MAD): 1
Skewness: 1.596954927
Sum: 183881
Variance: 2.541827193
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
0 | 61151 | 40.6%
1 | 55495 | 36.9%
4 | 17026 | 11.3%
2 | 9705 | 6.5%
6 | 6520 | 4.3%
3 | 509 | 0.3%
5 | 45 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
0 | 61151 | 40.6%
1 | 55495 | 36.9%
2 | 9705 | 6.5%
3 | 509 | 0.3%
4 | 17026 | 11.3%

Maximum 5 values

Value | Count | Frequency (%)
6 | 6520 | 4.3%
5 | 45 | < 0.1%
4 | 17026 | 11.3%
3 | 509 | 0.3%
2 | 9705 | 6.5%

Out_type
Categorical

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count | Frequency (%)
Not Applicable | 143013 | 95.1%
caught | 3678 | 2.4%
bowled | 1382 | 0.9%
run out | 755 | 0.5%
Keeper Catch | 695 | 0.5%
lbw | 455 | 0.3%
stumped | 243 | 0.2%
caught and bowled | 211 | 0.1%
retired hurt | 9 | < 0.1%
hit wicket | 9 | < 0.1%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 21
Median length: 14
Mean length: 13.645898
Min length: 3

Match_Date
Categorical

HIGH CARDINALITY

Distinct: 450
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count | Frequency (%)
4/23/2009 | 513 | 0.3%
4/29/2017 | 508 | 0.3%
05-05-2012 | 506 | 0.3%
4/16/2013 | 506 | 0.3%
5/19/2014 | 503 | 0.3%
4/29/2009 | 502 | 0.3%
3/25/2010 | 502 | 0.3%
3/21/2010 | 501 | 0.3%
5/17/2009 | 501 | 0.3%
4/16/2017 | 500 | 0.3%
Other values (440) | 145409 | 96.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 10
Median length: 9
Mean length: 9.371569481
Min length: 9

Striker
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 460
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 136.5370386
Minimum: 1
Maximum: 497
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 40
Median: 96
Q3: 208
95-th percentile: 376
Maximum: 497
Range: 496
Interquartile range (IQR): 168

Descriptive statistics

Standard deviation: 120.5342396
Coefficient of variation (CV): 0.8827951797
Kurtosis: -0.3033486636
Mean: 136.5370386
Median Absolute Deviation (MAD): 75
Skewness: 0.8738477794
Sum: 20542134
Variance: 14528.50291
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
8 | 3494 | 2.3%
40 | 3433 | 2.3%
21 | 3369 | 2.2%
57 | 3274 | 2.2%
42 | 3005 | 2.0%
46 | 2960 | 2.0%
187 | 2902 | 1.9%
20 | 2680 | 1.8%
85 | 2602 | 1.7%
162 | 2531 | 1.7%
Other values (450) | 120201 | 79.9%

Minimum 5 values

Value | Count | Frequency (%)
1 | 1326 | 0.9%
2 | 2181 | 1.4%
3 | 129 | 0.1%
4 | 1101 | 0.7%
5 | 84 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
497 | 12 | < 0.1%
496 | 26 | < 0.1%
495 | 11 | < 0.1%
491 | 31 | < 0.1%
490 | 2 | < 0.1%

Non_Striker
Real number (ℝ≥0)

Distinct: 457
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 135.6234189
Minimum: 1
Maximum: 497
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 40
Median: 96
Q3: 208
95-th percentile: 375
Maximum: 497
Range: 496
Interquartile range (IQR): 168

Descriptive statistics

Standard deviation: 120.0704115
Coefficient of variation (CV): 0.8853221104
Kurtosis: -0.2513749151
Mean: 135.6234189
Median Absolute Deviation (MAD): 73
Skewness: 0.8946355224
Sum: 20404679
Variance: 14416.90371
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
40 | 3635 | 2.4%
21 | 3483 | 2.3%
8 | 3350 | 2.2%
57 | 3306 | 2.2%
42 | 3248 | 2.2%
46 | 2848 | 1.9%
85 | 2831 | 1.9%
187 | 2672 | 1.8%
162 | 2458 | 1.6%
20 | 2432 | 1.6%
Other values (447) | 120188 | 79.9%

Minimum 5 values

Value | Count | Frequency (%)
1 | 1383 | 0.9%
2 | 2263 | 1.5%
3 | 142 | 0.1%
4 | 1129 | 0.8%
5 | 69 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
497 | 14 | < 0.1%
496 | 35 | < 0.1%
495 | 10 | < 0.1%
491 | 37 | < 0.1%
490 | 1 | < 0.1%

Bowler
Real number (ℝ≥0)

Distinct: 355
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 194.0870981
Minimum: 1
Maximum: 497
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 15
Q1: 77
Median: 174
Q3: 310
95-th percentile: 416
Maximum: 497
Range: 496
Interquartile range (IQR): 233

Descriptive statistics

Standard deviation: 132.9989497
Coefficient of variation (CV): 0.6852539453
Kurtosis: -1.066164182
Mean: 194.0870981
Median Absolute Deviation (MAD): 107
Skewness: 0.3924450847
Sum: 29200598
Variance: 17688.72063
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
50 | 2989 | 2.0%
136 | 2703 | 1.8%
194 | 2694 | 1.8%
14 | 2636 | 1.8%
67 | 2594 | 1.7%
201 | 2359 | 1.6%
15 | 2276 | 1.5%
81 | 2161 | 1.4%
94 | 2159 | 1.4%
29 | 2113 | 1.4%
Other values (345) | 125767 | 83.6%

Minimum 5 values

Value | Count | Frequency (%)
1 | 280 | 0.2%
4 | 323 | 0.2%
5 | 63 | < 0.1%
8 | 264 | 0.2%
9 | 1799 | 1.2%

Maximum 5 values

Value | Count | Frequency (%)
497 | 182 | 0.1%
495 | 111 | 0.1%
494 | 21 | < 0.1%
493 | 82 | 0.1%
492 | 24 | < 0.1%

Player_Out
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 435
Distinct (%): 5.8%
Missing: 143013
Missing (%): 95.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 148.6332347
Minimum: 1
Maximum: 497
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 41
Median: 107
Q3: 236
95-th percentile: 391
Maximum: 497
Range: 496
Interquartile range (IQR): 195

Descriptive statistics

Standard deviation: 124.7608827
Coefficient of variation (CV): 0.839387523
Kurtosis: -0.6106458837
Mean: 148.6332347
Median Absolute Deviation (MAD): 80
Skewness: 0.7270238885
Sum: 1105534
Variance: 15565.27786
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
21 | 134 | 0.1%
40 | 131 | 0.1%
57 | 129 | 0.1%
46 | 128 | 0.1%
8 | 118 | 0.1%
88 | 117 | 0.1%
42 | 109 | 0.1%
17 | 107 | 0.1%
27 | 101 | 0.1%
187 | 100 | 0.1%
Other values (425) | 6264 | 4.2%
(Missing) | 143013 | 95.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 53 | < 0.1%
2 | 98 | 0.1%
3 | 9 | < 0.1%
4 | 49 | < 0.1%
5 | 7 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
497 | 1 | < 0.1%
496 | 3 | < 0.1%
495 | 3 | < 0.1%
491 | 2 | < 0.1%
489 | 2 | < 0.1%

Fielders
Real number (ℝ≥0)

MISSING

Distinct: 400
Distinct (%): 7.5%
Missing: 145100
Missing (%): 96.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 155.3918894
Minimum: 1
Maximum: 497
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 10
Q1: 47
Median: 111
Q3: 237.5
95-th percentile: 386.5
Maximum: 497
Range: 496
Interquartile range (IQR): 190.5

Descriptive statistics

Standard deviation: 125.126355
Coefficient of variation (CV): 0.8052309261
Kurtosis: -0.6340701806
Mean: 155.3918894
Median Absolute Deviation (MAD): 81
Skewness: 0.6972704448
Sum: 831502
Variance: 15656.60471
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
88 | 127 | 0.1%
20 | 126 | 0.1%
46 | 115 | 0.1%
110 | 103 | 0.1%
21 | 96 | 0.1%
17 | 84 | 0.1%
183 | 82 | 0.1%
57 | 79 | 0.1%
53 | 75 | < 0.1%
8 | 74 | < 0.1%
Other values (390) | 4390 | 2.9%
(Missing) | 145100 | 96.4%

Minimum 5 values

Value | Count | Frequency (%)
1 | 23 | < 0.1%
2 | 50 | < 0.1%
3 | 4 | < 0.1%
4 | 29 | < 0.1%
5 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
497 | 5 | < 0.1%
496 | 2 | < 0.1%
495 | 1 | < 0.1%
491 | 3 | < 0.1%
490 | 1 | < 0.1%

Interactions

[Pairwise interaction plots; images not preserved]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
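
The calculation described above can be sketched in plain Python. This is an illustrative implementation using population (divide-by-n) moments, not the code that produced this report; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov / (std_x * std_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_linear = pearson_r(xs, ys)                         # perfectly linear, increasing
r_shifted = pearson_r([10 * x + 3 for x in xs], ys)  # location/scale change of X
r_negative = pearson_r(xs, [-y for y in ys])         # perfectly linear, decreasing
```

Rescaling and shifting X leaves r unchanged, which is the invariance property mentioned in the description.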

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
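
A minimal sketch of this rank-based calculation follows. It assigns tied values the average of their ranks (a common convention, assumed here) and then applies the Pearson formula to the ranks; the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Population covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
squares = [x * x for x in xs]             # monotonic but not linear
rho_monotonic = spearman_rho(xs, squares)
r_monotonic = pearson_r(xs, squares)
```

On this monotonic but nonlinear data, ρ reaches 1 while Pearson's r stays below 1, which illustrates the contrast drawn in the description.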

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
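
The pair-counting definition can be sketched directly. This is the simple variant often called tau-a, written for illustration; statistical libraries commonly apply a tie-adjusted variant (tau-b), so results on data with many ties may differ from the values in this report. The function name `kendall_tau_a` is an assumption.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # pairs tied in x or y count as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


tau_same = kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])  # 2 concordant, 1 discordant
```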

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
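
A sketch of a bias-corrected Cramér's V in plain Python follows. It assumes the commonly cited form of Bergsma's correction, in which φ² and the table dimensions are each adjusted by a term in 1/(n-1); whether this matches the report's implementation line for line is an assumption, and the function name `cramers_v_corrected` is illustrative.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    pair_counts = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = pair_counts.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    # Corrections assumed from Bergsma (2013).
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single category carries no association
    return sqrt(phi2_corr / denom)


v_perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
```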

Missing values

[Missing-value plots; images not preserved]

Sample

First rows

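(index) | MatcH_id | Over_id | Ball_id | Innings_No | Team_Batting | Team_Bowling | Striker_Batting_Position | Extra_Type | Batsman_Runs_Scored | Out_type | Match_Date | Striker | Non_Striker | Bowler | Player_Out | Fielders
0 | 598028 | 15 | 6 | 1 | 5 | 2 | 6.0 | No Extras | 4 | Not Applicable | 4/20/2013 | 277 | 104 | 83 | NaN | NaN
1 | 598028 | 14 | 1 | 1 | 5 | 2 | 5.0 | No Extras | 1 | Not Applicable | 4/20/2013 | 104 | 6 | 346 | NaN | NaN
2 | 598028 | 14 | 2 | 1 | 5 | 2 | 3.0 | No Extras | 1 | Not Applicable | 4/20/2013 | 6 | 104 | 346 | NaN | NaN
3 | 598028 | 14 | 3 | 1 | 5 | 2 | 5.0 | No Extras | 1 | Not Applicable | 4/20/2013 | 104 | 6 | 346 | NaN | NaN
4 | 598028 | 14 | 4 | 1 | 5 | 2 | 3.0 | No Extras | 0 | Not Applicable | 4/20/2013 | 6 | 104 | 346 | NaN | NaN
5 | 598028 | 14 | 5 | 1 | 5 | 2 | 3.0 | No Extras | 4 | Not Applicable | 4/20/2013 | 6 | 104 | 346 | NaN | NaN
6 | 598028 | 14 | 6 | 1 | 5 | 2 | 3.0 | No Extras | 2 | Not Applicable | 4/20/2013 | 6 | 104 | 346 | NaN | NaN
7 | 598028 | 13 | 1 | 1 | 5 | 2 | 5.0 | No Extras | 1 | Not Applicable | 4/20/2013 | 104 | 6 | 83 | NaN | NaN
8 | 598028 | 13 | 2 | 1 | 5 | 2 | 3.0 | No Extras | 4 | Not Applicable | 4/20/2013 | 6 | 104 | 83 | NaN | NaN
9 | 598028 | 13 | 3 | 1 | 5 | 2 | 3.0 | No Extras | 1 | Not Applicable | 4/20/2013 | 6 | 104 | 83 | NaN | NaN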

Last rows

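(index) | MatcH_id | Over_id | Ball_id | Innings_No | Team_Batting | Team_Bowling | Striker_Batting_Position | Extra_Type | Batsman_Runs_Scored | Out_type | Match_Date | Striker | Non_Striker | Bowler | Player_Out | Fielders
150441 | 598028 | 16 | 2 | 1 | 5 | 2 | 5.0 | No Extras | 0 | Keeper Catch | 4/20/2013 | 104 | 277 | 81 | 104.0 | 239.0
150442 | 598028 | 16 | 3 | 1 | 5 | 2 | 7.0 | No Extras | 0 | Not Applicable | 4/20/2013 | 310 | 277 | 81 | NaN | NaN
150443 | 598028 | 16 | 4 | 1 | 5 | 2 | 7.0 | No Extras | 2 | Not Applicable | 4/20/2013 | 310 | 277 | 81 | NaN | NaN
150444 | 598028 | 16 | 5 | 1 | 5 | 2 | 7.0 | No Extras | 0 | Not Applicable | 4/20/2013 | 310 | 277 | 81 | NaN | NaN
150445 | 598028 | 16 | 6 | 1 | 5 | 2 | 7.0 | No Extras | 1 | Not Applicable | 4/20/2013 | 310 | 277 | 81 | NaN | NaN
150446 | 598028 | 15 | 1 | 1 | 5 | 2 | 5.0 | No Extras | 1 | Not Applicable | 4/20/2013 | 104 | 6 | 83 | NaN | NaN
150447 | 598028 | 15 | 2 | 1 | 5 | 2 | 3.0 | No Extras | 2 | Not Applicable | 4/20/2013 | 6 | 104 | 83 | NaN | NaN
150448 | 598028 | 15 | 3 | 1 | 5 | 2 | 3.0 | No Extras | 4 | Not Applicable | 4/20/2013 | 6 | 104 | 83 | NaN | NaN
150449 | 598028 | 15 | 4 | 1 | 5 | 2 | 3.0 | No Extras | 0 | caught | 4/20/2013 | 6 | 104 | 83 | 6.0 | 349.0
150450 | 598028 | 15 | 5 | 1 | 5 | 2 | 6.0 | No Extras | 0 | Not Applicable | 4/20/2013 | 277 | 104 | 83 | NaN | NaN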